LOB phantom workaround: skip-lob-tables + clean test environment #18

Merged: rophy merged 9 commits into master from lob-phantom-workaround on Apr 6, 2026

Conversation

rophy (Owner) commented Apr 6, 2026

Summary

  • Add skip-lob-tables format option to skip DML events for tables with LOB columns
  • Document LOB phantom event root cause (investigation report)
  • Switch fuzz test to KafkaOffsetBackingStore with SCN pre-seeding for clean, reproducible test runs
  • Fix LOB replay validation in db-check

Changes

OLR

  • skip-lob-tables format option (Format.h, Builder.cpp, OpenLogReplicator.cpp): When set to 1, silently skips INSERT/UPDATE/DELETE for tables with CLOB/BLOB/NCLOB columns. Covers all 5 emit/skip decision points in Builder.cpp.
  • Documentation: 4.format.adoc updated with new option; LOB-PHANTOM-INVESTIGATION-REPORT.md documents root cause and signals investigated.
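
For reference, enabling the option in an OLR JSON configuration would look roughly like this (a sketch: only the format.skip-lob-tables key and its 0/1 values come from this PR; the surrounding structure is illustrative):

```json
{
  "format": {
    "skip-lob-tables": 1
  }
}
```

With the option set to 1, DML events for tables containing CLOB/BLOB/NCLOB columns are silently dropped; 0 leaves emission unchanged.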

Fuzz test framework

  • KafkaOffsetBackingStore: Both Debezium connectors switched from file-based to Kafka-based offset storage. action_up now gets the current SCN after table deployment and pre-seeds offset topics, ensuring no stale events from previous runs.
  • snapshot.mode=recovery: Rebuilds schema history from the pre-seeded offset without re-snapshotting data.
  • db-check fixes: Final-state replay for LOB table validation; compare row existence instead of final op type.
  • Documentation: FUZZ-TEST.md and AGENTS.md updated with clean run procedure.
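
The offset pre-seeding described above boils down to producing one key/value record per connector into its compacted offset topic. A hypothetical Python sketch of the payload construction (the key/value layouts are assumptions modeled on Kafka Connect's offset serialization, not code from this PR):

```python
import json

def build_seed_offset(engine_name: str, server_name: str, scn: int):
    """Build a key/value pair for pre-seeding a Kafka Connect offset topic.

    The exact layout is an assumption based on how Kafka Connect serializes
    source-connector offsets; verify against a real offset topic before
    relying on it.
    """
    key = json.dumps([engine_name, {"server": server_name}])
    value = json.dumps({"scn": str(scn)})
    return key, value

# fuzz-test.sh pipes "key|value" lines into kafka-console-producer
# with parse.key=true and key.separator='|'
key, value = build_seed_offset("engine", "oracle-rac", 123456789)
print(f"{key}|{value}")
```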

Test results

Non-LOB accuracy (SKIP_LOB=1, 10-minute RAC fuzz):

validate:  69,359 events, 0 mismatches
db-check:  41,641 rows — DB vs LM: 0 diffs, DB vs OLR: 0 diffs

Redo log regression: 161 passed.

Related

Test plan

  • make test-redo — 161 passed
  • SKIP_LOB=1 ./fuzz-test.sh run 10 + validate + db-check — 100% match
  • skip-lob-tables=1 verified: 0 new LOB events from OLR

Summary by CodeRabbit

  • New Features

    • Added configuration option to skip emitting DML for tables with LOB columns.
  • Documentation

    • Added LOB phantom investigation report for Oracle RAC.
    • Expanded fuzz-test and agent docs with prerequisites, explicit lifecycle steps, and a required clean down-before-up workflow.
  • Tests

    • Added end-to-end 3-way DB-check validation with sentinel-based sync and seeded offsets.
    • Validator now uses final-state replay for LOB comparison; test infra switched to Kafka-backed offsets and host-mode Kafka networking.

rophy added 6 commits April 6, 2026 03:23
Replace seq-by-seq comparison with replay_final_state() for LOB tables.
Replays INSERT/UPDATE/DELETE ops into a final row state per event_id,
then compares end results. This handles:
- LogMiner merging INSERT + LOB_WRITE into single record (L2)
- OLR emitting extra/fewer intermediate events (phantom undo #15)
- Seq number shifting from misaligned event counts

Non-LOB tables remain seq-by-seq (100% accurate, no merge issues).

LOB known issues: 60 → 18 (irreducible phantom commits).
LogMiner merges INSERT + LOB_WRITE into a single INSERT (L2), while
OLR emits them separately (INSERT then UPDATE). Comparing final op
type (INSERT vs UPDATE) produces false diffs. Compare row_exists
(bool) instead — both mean "row exists with these values".

None values are NOT skipped — per Debezium docs, None means genuinely
NULL while __debezium_unavailable_value means unchanged. These are
semantically different.

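
The replay approach described above can be sketched as follows (a simplified illustration, not the actual validator code; the event shape and function name are assumptions):

```python
def replay_final_state(events):
    """Replay ordered INSERT/UPDATE/DELETE events into final row state.

    events: iterable of (event_id, op, values) tuples in commit order.
    Returns {event_id: values} for rows that still exist at the end.
    Illustrative only; the real validator tracks more metadata.
    """
    state = {}
    for event_id, op, values in events:
        if op in ("INSERT", "UPDATE"):
            # UPDATE merges changed columns onto the existing row, if any
            merged = dict(state.get(event_id, {}))
            merged.update(values or {})
            state[event_id] = merged
        elif op == "DELETE":
            state.pop(event_id, None)
    return state

# Row-existence comparison: an LM INSERT (with LOB_WRITE merged in) and an
# OLR INSERT followed by UPDATE converge to the same final row.
lm = replay_final_state([("e1", "INSERT", {"id": 1, "doc": "abc"})])
olr = replay_final_state([("e1", "INSERT", {"id": 1}),
                          ("e1", "UPDATE", {"doc": "abc"})])
assert ("e1" in lm) == ("e1" in olr)
```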
Comprehensive analysis of phantom LOB events on Oracle RAC.
OLR emits ~0.3% extra rows on LOB tables, rows that do not exist in the
database; they originate from Oracle's internal LOB segment management.

Investigated dataObj, bdba+slot, suppLogBdba, and opcode-based
discrimination. None reliably distinguish internal LOB ops from
user DML in raw redo. Only COMMITTED_DATA_ONLY (LogMiner) can
filter them using Oracle-internal state.

Non-LOB tables: 100% accurate across all test runs.

Allows users to skip emitting DML events for tables with LOB columns
(CLOB/BLOB/NCLOB). Oracle's internal LOB segment management generates
redo records indistinguishable from user DML, causing phantom events
on RAC systems. This option lets users replicate LOB tables through
a separate mechanism while using OLR for non-LOB tables.

The initial implementation only covered the multi-row (Multiple) code
paths. Single-row INSERT and DELETE in Builder.cpp also need the check.
Verified: OLR emits 0 FUZZ_LOB events with skip-lob-tables=1.

Switch both Debezium connectors from FileOffsetBackingStore (binary)
to KafkaOffsetBackingStore. The `up` action now gets the current SCN
from Oracle after table deployment and pre-seeds Kafka offset topics,
ensuring both connectors start from exactly that SCN with no stale
events from previous runs.

- Debezium configs: KafkaOffsetBackingStore + snapshot.mode=recovery
- fuzz-test.sh: seed offsets after Kafka ready, before Debezium starts
- docker-compose: remove hostname for Kafka (incompatible with host network)
- FUZZ-TEST.md: document clean run procedure (down/up/run/validate/db-check)
- AGENTS.md: reference fuzz test documentation
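
The configuration side of this switch would look roughly like the following properties fragment (a sketch: KafkaOffsetBackingStore and snapshot.mode=recovery come from this PR, while the topic name and exact property keys are illustrative assumptions that depend on the Debezium embedded-engine version):

```properties
# Kafka-backed offset storage instead of FileOffsetBackingStore
offset.storage=org.apache.kafka.connect.storage.KafkaOffsetBackingStore
offset.storage.topic=fuzz-olr-offsets
bootstrap.servers=localhost:9092

# Rebuild schema history from the pre-seeded offset without re-snapshotting
snapshot.mode=recovery
```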

coderabbitai Bot commented Apr 6, 2026

📝 Walkthrough

Walkthrough

Adds a skip-lob-tables format option, investigates Oracle RAC LOB-phantom CDC behavior, and updates RAC fuzz-test tooling (seeded offsets, sentinel-based final-state validation, db-check) plus validator logic to support LOB final-state comparison and an opt-out to skip emitting DML for LOB-containing tables.

Changes

  • Docs & Guides (AGENTS.md, documentation/LOB-PHANTOM-INVESTIGATION-REPORT.md, documentation/json/4.format.adoc, tests/dbz-twin/rac/FUZZ-TEST.md): Added the LOB-phantom investigation report; documented the skip-lob-tables config; updated the FUZZ-TEST workflow, quick start, and required down/up sequence.
  • Format API & Parsing (src/common/Format.h, src/OpenLogReplicator.cpp): Added the Format::skipLobTables member and JSON parsing/validation for format.skip-lob-tables (accepts 0/1; 1 → true; other values error); propagates the setting into the runtime Format.
  • DML Emission Gate (src/builder/Builder.cpp): Gated processInsert/processUpdate/processDelete calls to skip emitting DML when format.skipLobTables is enabled and the resolved table contains LOB columns.
  • Fuzz infra, orchestration & offsets (tests/dbz-twin/rac/docker-compose-fuzz.yaml, tests/dbz-twin/rac/fuzz-test.sh): Switched Kafka to host networking; added _seed_debezium_offsets() to pre-seed compacted Kafka offset topics; up now brings up Kafka, seeds offsets from the current SCN, then starts the connectors; added sentinel insertion and a db-check action.
  • Debezium connector configs (tests/dbz-twin/rac/config/application-logminer-kafka.properties, tests/dbz-twin/rac/config/application-olr-kafka.properties): Changed snapshot mode to recovery; migrated the offset backing store from file to Kafka (KafkaOffsetBackingStore) with dedicated offset topics and bootstrap server settings.
  • Validator & final-state check (tests/dbz-twin/rac/validator.py, tests/dbz-twin/rac/db-check.py): Replaced the LOB merge with replay_final_state() and added a LOB final-state comparison path; added the new db-check.py to wait for the sentinel, replay CDC to final state, query Oracle ground truth, and report matched/missing/extra/diffs.
  • Tests & Scripts (tests/dbz-twin/rac/*): Multiple test/tooling updates to support seeded offsets, sentinel-based validation, db-check polling, and SKIP_LOB invocation in workloads.

Sequence Diagram(s)

sequenceDiagram
  participant User as "Operator"
  participant Compose as "docker compose"
  participant Kafka as "Kafka (host)"
  participant Debezium as "Debezium connectors (LM/OLR)"
  participant Oracle as "Oracle RAC"
  participant Consumer as "fuzz-consumer (SQLite)"
  participant DBCheck as "db-check.py"

  rect rgba(0,128,255,0.5)
  User->>Compose: fuzz-test.sh up
  Compose->>Kafka: start Kafka
  end

  rect rgba(0,200,100,0.5)
  Kafka->>User: ready
  User->>Compose: _seed_debezium_offsets(current SCN)
  Compose->>Debezium: start connectors (seeded offsets)
  Debezium->>Oracle: read redo (from seeded SCN)
  Debezium->>Kafka: publish CDC events
  Kafka->>Consumer: deliver events -> write SQLite
  end

  rect rgba(255,128,0,0.5)
  User->>Compose: fuzz-test.sh run <N>
  User->>Oracle: run workload (includes sentinel row)
  Oracle->>Debezium: redo stream contains sentinel
  Debezium->>Kafka: stream sentinel event
  Kafka->>Consumer: consumer persists sentinel to SQLite
  end

  rect rgba(200,0,150,0.5)
  User->>DBCheck: fuzz-test.sh db-check
  DBCheck->>Consumer: copy SQLite, wait for sentinel
  DBCheck->>Oracle: query ground-truth final rows
  DBCheck->>User: report matched / missing / extra / diffs
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

Possibly related PRs

Poem

🐰 I seeded offsets at break of dawn,
I chased phantom LOBs until they were gone.
A sentinel planted, SQLite sings,
db-check counts truth and fixes the strings.
Hoppity — tests pass, and joytime begins!

🚥 Pre-merge checks: 2 passed, 1 failed

❌ Failed checks (1 warning)

  • Docstring Coverage (Warning): docstring coverage is 50.00%, which is below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2)

  • Description Check: skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: the title accurately summarizes the two main changes, a new skip-lob-tables configuration option and test environment improvements for clean runs, both directly reflected in the changeset files.


coderabbitai Bot left a comment
Actionable comments posted: 2

🧹 Nitpick comments (3)
src/OpenLogReplicator.cpp (1)

779-785: Prefer constructor-initializing Format with skipLobTables.

Format is now built in a partially initialized state and then mutated immediately after. Threading this flag through the constructor keeps the config object fully initialized at creation sites and makes future call sites harder to miss.

♻️ Proposed cleanup
-            Format format(dbFormat, attributesFormat, intervalDtsFormat, intervalYtmFormat, messageFormat, ridFormat, redoThreadFormat, xidFormat,
-                timestampFormat, timestampMetadataFormat, timestampTzFormat, timestampType, charFormat, scnFormat, scnType, unknownFormat,
-                schemaFormat, columnFormat, unknownType, userType, jsonNumberType, charsetOverrideId);
-            format.skipLobTables = skipLobTables;
+            Format format(dbFormat, attributesFormat, intervalDtsFormat, intervalYtmFormat, messageFormat, ridFormat, redoThreadFormat, xidFormat,
+                timestampFormat, timestampMetadataFormat, timestampTzFormat, timestampType, charFormat, scnFormat, scnType, unknownFormat,
+                schemaFormat, columnFormat, unknownType, userType, jsonNumberType, skipLobTables, charsetOverrideId);
-               USER_TYPE newUserType, JSON_NUMBER_TYPE newJsonNumberType = JSON_NUMBER_TYPE::AS_NULL, uint64_t newCharsetOverrideId = 0):
+               USER_TYPE newUserType, JSON_NUMBER_TYPE newJsonNumberType = JSON_NUMBER_TYPE::AS_NULL,
+               bool newSkipLobTables = false, uint64_t newCharsetOverrideId = 0):
                 ...
                 userType(newUserType),
                 jsonNumberType(newJsonNumberType),
+                skipLobTables(newSkipLobTables),
                 charsetOverrideId(newCharsetOverrideId) {}

Also applies to: 901-905

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/OpenLogReplicator.cpp` around lines 779 - 785, The code reads
"skip-lob-tables" into a local bool and mutates the Format object after
construction; instead, add a skipLobTables parameter to Format's constructor and
pass the computed value at construction time (keep the validation using
Ctx::getJsonFieldU and the same ConfigurationException check), update all Format
construction sites (including the other occurrence around lines 901-905) to
supply the new argument, and update Format's constructor signature/initializers
so the member is set via initializer list rather than via post-construction
mutation.
tests/dbz-twin/rac/fuzz-test.sh (1)

81-108: Robust offset seeding function, but topic creation error handling is opaque.

The function correctly seeds Debezium offsets with the current SCN. However, suppressing all stderr with 2>/dev/null on both topic creation (line 97) and producer (line 104) makes debugging difficult if something goes wrong.

Consider at least logging a debug message or checking the exit code for the producer command, since a failed seed would cause connectors to start from the wrong position.

Optional: Add minimal error checking for the producer
         # Produce seed offset
-        echo "${offset_key}|${offset_val}" | docker exec -i fuzz-kafka /opt/kafka/bin/kafka-console-producer.sh \
+        if ! echo "${offset_key}|${offset_val}" | docker exec -i fuzz-kafka /opt/kafka/bin/kafka-console-producer.sh \
             --bootstrap-server localhost:9092 \
             --topic "$offset_topic" \
             --property parse.key=true \
-            --property key.separator='|' 2>/dev/null
+            --property key.separator='|' 2>/dev/null; then
+            echo "WARNING: Failed to seed $offset_topic" >&2
+        fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/dbz-twin/rac/fuzz-test.sh` around lines 81 - 108, The
_seed_debezium_offsets function hides failures by redirecting stderr to
/dev/null for topic creation and the kafka-console-producer; update it to
capture and check the exit status of both docker exec commands and surface
errors: run the kafka-topics.sh and kafka-console-producer.sh calls without
swallowing stderr (or redirect stderr to a temp var), then if either command
returns non-zero, echo a clear error message including the topic name,
offset_key and offset_val (or captured stderr) and exit or return a non-zero
status so callers can detect failure; specifically modify the docker exec calls
inside _seed_debezium_offsets to preserve stderr, capture their exit codes, and
add conditional logging/exit when those codes indicate failure.
tests/dbz-twin/rac/validator.py (1)

284-303: LOB comparison silently counts differences without diagnostic output.

For LOB tables, when lm_exists != olr_exists (line 292-294) or when value diffs exist (line 298-299), the code increments total_lob_known but doesn't log the specific event_id or nature of the difference. This makes it harder to investigate which specific LOB events are phantom.

Consider adding optional verbose logging for LOB differences to aid debugging.

Optional: Add diagnostic logging for LOB differences
                     if lm_exists != olr_exists:
+                        print(f"[LOB_KNOWN] {eid} existence mismatch: "
+                              f"LM={lm_exists} OLR={olr_exists}", flush=True)
                         total_lob_known += 1
                         total_validated += 1
                     else:
                         diffs = compare_values(lm_state, olr_state,
                                                event_table, 'after')
                         if diffs:
+                            print(f"[LOB_KNOWN] {eid} value diffs", flush=True)
                             total_lob_known += 1
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/dbz-twin/rac/validator.py` around lines 284 - 303, When handling LOB
tables in the block using replay_final_state and compare_values, add optional
verbose diagnostic logging when lm_exists != olr_exists or when diffs is
non-empty: log the event identifier (event_id), the lm_exists/olr_exists
booleans, a short summary or count of diffs from compare_values and the
lm_state/olr_state snapshots (or a trimmed diff) as well as the lm_recs/olr_recs
context; gate this output behind an existing verbose/debug flag or logger so it
only prints when enabled. Ensure you modify the branch where lm_exists !=
olr_exists and the branch where diffs is truthy (references: replay_final_state,
compare_values, lm_recs, olr_recs, lm_state, olr_state, total_lob_known) to emit
the diagnostic messages.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/builder/Builder.cpp`:
- Around line 806-809: The LOB-skip check (format.skipLobTables &&
!table->lobs.empty()) must be hoisted out of the left-hand branch so it always
prevents emitting DML for known LOB tables regardless of the fallback OR arms;
update the conditional around the insert/row-emission guard (the compound if
using schema, table, DbTable::isSystemTable, DbTable::isDebugTable,
table->matchesCondition and
ctx->isFlagSet(Ctx::REDO_FLAGS::SHOW_SYSTEM_TRANSACTIONS/SCHEMALESS)) to first
bail out when format.skipLobTables && !table->lobs.empty(), then evaluate the
rest of the predicate, and apply the same restructuring to the analogous
delete/update and multi-row guards referenced in the comment (the other
occurrences at the noted locations) so SHOW_SYSTEM_TRANSACTIONS or SCHEMALESS no
longer override the LOB skip and counted rows are correctly skipped.

In `@tests/dbz-twin/rac/fuzz-test.sh`:
- Around line 461-475: The action_db_check() function invokes a missing script
tests/dbz-twin/rac/db-check.py via python3 "$SCRIPT_DIR/db-check.py", which will
cause runtime failure; either add and commit the db-check.py script implementing
the expected polling-and-sentinel-check behavior (reading SQLITE_DB path passed
via environment and re-copying the container DB), or change action_db_check() to
call an existing, committed validation utility or inline the check logic; update
references to SQLITE_DB, ORACLE_HOST and ensure SCRIPT_DIR points to the script
location so python3 "$SCRIPT_DIR/db-check.py" can execute successfully.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6d745ab7-d85f-45f3-91a4-7aefd4325607

📥 Commits

Reviewing files that changed from the base of the PR and between 7f4b893 and 3a00252.

📒 Files selected for processing (12)
  • AGENTS.md
  • documentation/LOB-PHANTOM-INVESTIGATION-REPORT.md
  • documentation/json/4.format.adoc
  • src/OpenLogReplicator.cpp
  • src/builder/Builder.cpp
  • src/common/Format.h
  • tests/dbz-twin/rac/FUZZ-TEST.md
  • tests/dbz-twin/rac/config/application-logminer-kafka.properties
  • tests/dbz-twin/rac/config/application-olr-kafka.properties
  • tests/dbz-twin/rac/docker-compose-fuzz.yaml
  • tests/dbz-twin/rac/fuzz-test.sh
  • tests/dbz-twin/rac/validator.py

Comment thread src/builder/Builder.cpp Outdated
Comment thread tests/dbz-twin/rac/fuzz-test.sh
- Hoist skip-lob-tables check outside the || fallback path at all 5
  emit/skip decision points in Builder.cpp. Previously SHOW_SYSTEM_TRANSACTIONS
  or SCHEMALESS flags would bypass the skip and still emit LOB table events.
- Add missing db-check.py to the repository (was untracked).
coderabbitai Bot left a comment
Actionable comments posted: 4

🧹 Nitpick comments (1)
tests/dbz-twin/rac/db-check.py (1)

314-315: Preserve falsy values in diff previews.

0 and empty strings are currently printed as 'None', which makes diff output misleading.

💡 Suggested fix
-        db_s = str(db_val)[:40] if db_val else 'None'
-        cdc_s = str(cdc_val)[:40] if cdc_val else 'None'
+        db_s = str(db_val)[:40] if db_val is not None else 'None'
+        cdc_s = str(cdc_val)[:40] if cdc_val is not None else 'None'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/dbz-twin/rac/db-check.py` around lines 314 - 315, The diff preview
treats falsy values (0, empty string) as None because the ternary uses
truthiness; update the conditions for db_s and cdc_s to check for None
explicitly (use "if db_val is not None" and "if cdc_val is not None") so that 0
and '' are preserved, then convert to str and slice as before (reference
variables db_val, cdc_val and the preview vars db_s, cdc_s).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/dbz-twin/rac/db-check.py`:
- Around line 8-17: The docstring promises a --watermark-scn CLI flag and
WATERMARK_SCN env var but main() never reads them; update main() to parse a
--watermark-scn argument (use argparse) and to read
os.environ.get("WATERMARK_SCN") as a fallback, prefer the CLI value over the env
var, convert the value to an integer (SCN) and use that SCN instead of always
querying Oracle for the current SCN; ensure the code path that currently queries
Oracle for the SCN only runs when neither the CLI flag nor WATERMARK_SCN is
provided so behavior matches the documentation.
- Around line 217-224: The current per-table except block around cursor.execute
for table_name only logs a warning and continues, which can hide failures;
update the exception handling in that try/except to either raise the exception
or set the overall failure flag (any_issues=True) before continuing so a failed
SELECT does not produce a false “ALL MATCH”; specifically modify the except
Exception as e block that prints the warning for cursor.execute to also mark
any_issues=True (or re-raise) so the calling check reflects the error for
table_name instead of silently skipping it.
- Around line 90-101: refresh_db currently runs docker cp commands via
subprocess.run and ignores failures, allowing stale SQLite snapshots to persist;
update refresh_db to check each subprocess.run result (calls that copy fuzz.db,
fuzz.db-wal, fuzz.db-shm from container 'fuzz-consumer' to sqlite_path) and fail
fast on non-zero returncode by raising an exception or aborting (include the
stderr output in the error). Ensure you reference the subprocess.run invocations
in refresh_db and validate returncode for each call, handling errors immediately
rather than silently continuing.
- Around line 115-127: Replace the try/except block that calls sqlite3.connect
with a with-statement using sqlite3.connect(sqlite_path) as conn and keep
conn.row_factory set inside that context, then call has_sentinel, get_max_scn
and get_event_count (for 'lm_events' and 'olr_events') inside the with so the
connection is always closed; also replace the broad except Exception with a more
specific except sqlite3.Error (or multiple specific sqlite3 exceptions if
desired) to handle DB errors and still sleep/continue on failure.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8b416dda-2d62-4826-a245-38657e0f62b4

📥 Commits

Reviewing files that changed from the base of the PR and between 3a00252 and 0c5b989.

📒 Files selected for processing (2)
  • src/builder/Builder.cpp
  • tests/dbz-twin/rac/db-check.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/builder/Builder.cpp

Comment thread tests/dbz-twin/rac/db-check.py Outdated
Comment thread tests/dbz-twin/rac/db-check.py Outdated
Comment thread tests/dbz-twin/rac/db-check.py
Comment thread tests/dbz-twin/rac/db-check.py
- Remove stale docstring advertising unimplemented --watermark-scn flag
- Fail fast on docker cp failures in refresh_db() (main DB required, WAL/SHM optional)
- Use sqlite3 context manager and catch specific sqlite3.Error in polling loop
- Raise on Oracle query failures instead of silently skipping tables
coderabbitai Bot left a comment
Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/dbz-twin/rac/db-check.py (1)

333-343: Use context manager for sqlite3 connection for consistency.

Unlike wait_for_sentinel which properly uses with sqlite3.connect(...) as conn:, this connection is manually opened and closed. If an exception occurs during replay_events, the connection won't be properly closed.

♻️ Suggested fix
     # Replay
-    conn = sqlite3.connect(sqlite_path)
-    conn.row_factory = sqlite3.Row
-
-    print("\n--- Replaying LM events ---")
-    lm_state = replay_events(conn, 'lm_events')
-    print(f"  LM replay rows: {len(lm_state)}")
-
-    print("\n--- Replaying OLR events ---")
-    olr_state = replay_events(conn, 'olr_events')
-    print(f"  OLR replay rows: {len(olr_state)}")
-    conn.close()
+    with sqlite3.connect(sqlite_path) as conn:
+        conn.row_factory = sqlite3.Row
+
+        print("\n--- Replaying LM events ---")
+        lm_state = replay_events(conn, 'lm_events')
+        print(f"  LM replay rows: {len(lm_state)}")
+
+        print("\n--- Replaying OLR events ---")
+        olr_state = replay_events(conn, 'olr_events')
+        print(f"  OLR replay rows: {len(olr_state)}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/dbz-twin/rac/db-check.py` around lines 333 - 343, Replace the manual
sqlite3 connection with a context manager to ensure the connection is always
closed on exceptions: use "with sqlite3.connect(sqlite_path) as conn:" (keep
setting conn.row_factory = sqlite3.Row) and move the calls to replay_events(...,
'lm_events') and replay_events(..., 'olr_events') and the print statements
inside that with-block so lm_state and olr_state are produced while the
connection is open and automatically closed afterward; reference symbols:
sqlite3.connect, conn.row_factory, replay_events, lm_state, olr_state.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/dbz-twin/rac/db-check.py`:
- Around line 140-189: In replay_events, the CDC table name is used without
normalization which can mismatch the uppercase TABLES and compare_states
lookups; update the assignment of table (row['table_name']) inside the
replay_events function to normalize to uppercase (e.g., set table =
row['table_name'].upper()) so keys stored in state use the same uppercase table
name as column normalization and the TABLES constant.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f9853f60-b9f6-4b1e-9c37-2a75ba7a3c98

📥 Commits

Reviewing files that changed from the base of the PR and between 0c5b989 and 78bad02.

📒 Files selected for processing (1)
  • tests/dbz-twin/rac/db-check.py

Comment thread tests/dbz-twin/rac/db-check.py
Aligns with the TABLES constant and Oracle query results, which are uppercase.
Prevents a silent mismatch if CDC events use mixed-case table names.
coderabbitai Bot left a comment
🧹 Nitpick comments (2)
tests/dbz-twin/rac/db-check.py (2)

333-343: Use context manager for SQLite connection in main().

Similar to the pattern used in wait_for_sentinel, wrap the connection in a context manager to ensure cleanup if an exception occurs during replay.

♻️ Proposed fix
     # Replay
-    conn = sqlite3.connect(sqlite_path)
-    conn.row_factory = sqlite3.Row
-
-    print("\n--- Replaying LM events ---")
-    lm_state = replay_events(conn, 'lm_events')
-    print(f"  LM replay rows: {len(lm_state)}")
-
-    print("\n--- Replaying OLR events ---")
-    olr_state = replay_events(conn, 'olr_events')
-    print(f"  OLR replay rows: {len(olr_state)}")
-    conn.close()
+    with sqlite3.connect(sqlite_path) as conn:
+        conn.row_factory = sqlite3.Row
+
+        print("\n--- Replaying LM events ---")
+        lm_state = replay_events(conn, 'lm_events')
+        print(f"  LM replay rows: {len(lm_state)}")
+
+        print("\n--- Replaying OLR events ---")
+        olr_state = replay_events(conn, 'olr_events')
+        print(f"  OLR replay rows: {len(olr_state)}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/dbz-twin/rac/db-check.py` around lines 333 - 343, The SQLite connection
in main currently uses sqlite3.connect(sqlite_path) and manually calls
conn.close(); change it to use a context manager so the connection is always
closed on exceptions: open the DB with "with sqlite3.connect(sqlite_path) as
conn:" (set conn.row_factory = sqlite3.Row inside the with block), call
replay_events(conn, 'lm_events') and replay_events(conn, 'olr_events') inside
that block, and remove the explicit conn.close() call; keep using the same
identifiers (sqlite3.connect, conn.row_factory, replay_events, 'lm_events',
'olr_events') so behavior is unchanged but cleanup is automatic, consistent with
wait_for_sentinel.
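One caveat worth noting about the suggested fix: sqlite3's connection context manager only commits or rolls back the enclosing transaction on exit; it does not close the connection. If guaranteed cleanup is the goal, `contextlib.closing` is the pattern that actually closes on exceptions. A runnable sketch (in-memory database used for illustration):

```python
import sqlite3
from contextlib import closing

# "with sqlite3.connect(...)" commits/rolls back but leaves the connection
# open. Wrapping in contextlib.closing guarantees close() even if an
# exception escapes the block.
with closing(sqlite3.connect(":memory:")) as conn:
    conn.row_factory = sqlite3.Row
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.execute("INSERT INTO t VALUES (1)")
    row = conn.execute("SELECT x FROM t").fetchone()
    assert row["x"] == 1

# The connection is now closed; further use raises ProgrammingError.
try:
    conn.execute("SELECT 1")
    closed = False
except sqlite3.ProgrammingError:
    closed = True
assert closed
```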

201-236: Use context manager for Oracle connection to ensure cleanup on exceptions.

If an exception occurs after oracledb.connect() but before conn.close() (e.g., during row processing at lines 219-233), the connection leaks.

♻️ Proposed fix using context manager
-    conn = oracledb.connect(dsn)
-    cursor = conn.cursor()
     state = {}
     failures = []
 
-    for table_name, pk_col, compare_cols in TABLES:
-        all_cols = list(dict.fromkeys([pk_col, 'EVENT_ID'] + compare_cols))
+    with oracledb.connect(dsn) as conn:
+        cursor = conn.cursor()
+
+        for table_name, pk_col, compare_cols in TABLES:
+            all_cols = list(dict.fromkeys([pk_col, 'EVENT_ID', *compare_cols]))
             ...
-    cursor.close()
-    conn.close()
+        cursor.close()
+
     if failures:
         raise RuntimeError(f"Oracle query failures: {failures}")
     return state
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/dbz-twin/rac/db-check.py` around lines 201 - 236, The code leaks the
Oracle connection/cursor if an exception occurs after oracledb.connect(); fix by
using context managers: replace manual conn = oracledb.connect(dsn) / cursor =
conn.cursor() and explicit close() calls with "with oracledb.connect(dsn) as
conn:" and nested "with conn.cursor() as cursor:" around the loop over TABLES so
both conn and cursor are automatically closed on exceptions; preserve the
existing try/except that captures query errors (failures.append((table_name,
str(e)))) inside the new with-block and keep building state using the same
col_names/row handling logic.
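The nested with-pattern the prompt describes relies on oracledb connections and cursors supporting the context-manager protocol and closing on exit. To show the cleanup guarantee without a live database, here is a sketch using a stand-in connection class (FakeConn is purely illustrative; the real code would use `oracledb.connect(dsn)`):

```python
# Stand-in for an oracledb connection, to demonstrate that a with-block
# closes the resource even when an exception escapes mid-loop.
class FakeConn:
    def __init__(self):
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()
        return False  # do not suppress the exception

    def close(self):
        self.closed = True

conn = FakeConn()
try:
    with conn:
        raise RuntimeError("query failure mid-loop")  # simulated row-processing error
except RuntimeError:
    pass
assert conn.closed  # closed despite the exception
```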

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 563ae4b4-37f8-4e40-a563-a8809338c4cd

📥 Commits

Reviewing files that changed from the base of the PR and between 78bad02 and d66f6fb.

📒 Files selected for processing (1)
  • tests/dbz-twin/rac/db-check.py

@rophy rophy merged commit 7087cc6 into master Apr 6, 2026
2 checks passed
rophy added a commit that referenced this pull request Apr 6, 2026
- Hoist skip-lob-tables check outside the || fallback path at all 5
  emit/skip decision points in Builder.cpp. Previously SHOW_SYSTEM_TRANSACTIONS
  or SCHEMALESS flags would bypass the skip and still emit LOB table events.
- Add missing db-check.py to the repository (was untracked).
@rophy rophy deleted the lob-phantom-workaround branch April 6, 2026 04:35
